Phantom redundancy: a register transfer level technique for gracefully degradable data path synthesis
Abstract
In this paper we present an area-efficient register transfer level technique for gracefully degradable data path synthesis called phantom redundancy. In contrast to spare-based approaches, phantom redundancy is a recovery technique that does not use any standby spares. Phantom redundancy uses extra interconnect to make the resulting data path reconfigurable in the presence of any single functional unit failure. When phantom redundancy is combined with a concurrent error detection technique, concurrent error detection followed by reconfiguration is automatic. There is a tight interdependence between reconfiguration of a (faulty) data path and the scheduling and operation-to-operator binding tasks during register transfer level synthesis. We therefore developed a genetic algorithm based register transfer level synthesis approach that incorporates phantom redundancy constraints. The algorithm minimizes the performance degradation of the synthesized data path in the presence of any single faulty functional unit. The effectiveness of the technique and the algorithm is illustrated using high level synthesis benchmarks.

Contact information: [email protected]; tel: 718 260 3596; fax: 718 260 3906

1.0 Introduction

Advances in VLSI have made it possible to implement complex algorithms on a single integrated circuit (IC) with the attendant advantages of reduced power consumption, higher reliability and reduced size and weight. While increasing device densities have made it possible to implement such complex VLSI systems, they have also rendered the ICs highly susceptible to a variety of fabrication-time fault mechanisms. In many VLSI applications, it is not uncommon to experience circuit yields on the order of 10% or even less, which increases the cost of manufacturing the circuit.
A number of researchers have examined fabrication-time reconfiguration approaches to enhance the yield of ICs. These techniques identify failed functional units in a fabricated IC and program the wires to reconfigure the fault-free functional units into a working IC. Built-In-Self-Repair (BISR) is a popular reconfiguration technique. BISR approaches have been applied mostly to regular architectures such as memory [1]. In BISR, reconfiguration is realized by providing a set of spare modules in addition to the core operational modules [1].

In this paper we present a register transfer level technique for reconfiguration of ICs called phantom redundancy that does not use spare modules. Rather, phantom redundancy uses redundant programmable interconnect. When a functional unit is faulty, the interconnection network in the data path is reprogrammed to configure the fault-free functional units into an operational data path, albeit with degraded performance. Phantom redundancy is applicable to both regular and non-regular data paths and entails small area overhead. Phantom redundancy does not itself perform concurrent error detection (CED). When combined with a CED and faulty unit location technique such as introspection [6], phantom redundancy can be used for dynamic reconfiguration in the field.

1.1 Related Research

VLSI reconfiguration techniques have been developed to make regular processor arrays tolerant to faults occurring during operation. Using a spare row (column) of processing elements, Negrini et al. developed a rippling replacement strategy [14]. A faulty module is replaced with its neighbor in the same row (column). When both a spare row and a spare column are available, the fault stealing strategy can be used. In fault stealing, a faulty module is replaced with a neighbor either in the same row or in the same column [14]. When multiple spare rows and columns are present, a repair-most strategy can be used [15].
The repair-most strategy is based on a graph-theoretic formulation and a bipartite matching approach. An RT level reconfigurable data path synthesis technique based on spare functional units, called built-in-self-repair (BISR), has been proposed by Guerra et al. [3]. Instead of one spare module for each active module, BISR uses one spare module for each module type. All of these approaches use spare modules.

Tolerance to defects related to the IC fabrication process can be improved using two techniques. Tuning the process parameters can reduce such fabrication-time defects in the device [19]. However, such process yield maximization does not totally eliminate the fabrication-related defects. Along an orthogonal dimension, defect-tolerant circuit design and layout techniques can maximize the circuit yield. While Chiluvuri and Koren [20] developed layout compaction algorithms to maximize defect-tolerance, Allan et al. [21] proposed selective relaxation of the layout design rules. Phantom redundancy complements these layout-level defect-tolerance techniques.

RT level synthesis techniques for area-optimal [8,9], performance-optimal [10,11] and power-optimal [22,24] data path design have been explored. RT level data path synthesis targeting off-line testability [23,26] and on-line testability [2,3,4,5,7,12] has also been addressed. In [2,5] it has been shown that recovery from transient faults can be done efficiently at the RT level by checkpointing and rollback in hardware. Before rollback-based recovery or reconfiguration can be carried out, the faulty unit must be identified. Concurrent error detection (CED) and faulty unit location are hence important. Straightforward duplication entails significant area overhead. RT level techniques for area-efficient CED based on fault security were developed in [4,7]. RT level techniques using spare capacity in a design have also been proposed [6].
An RT level reconfigurable data path synthesis technique using spare functional units has been proposed by Guerra et al. [3]. On-line testable controller unit synthesis has been reported in [12]. The proposed technique can be used in combination with these CED techniques.

1.2 Issues in Gracefully Degradable Data Path Synthesis

1.2.1 The Design Methodology

We propose to incorporate phantom redundancy reconfiguration constraints within a top-down VLSI design methodology. From among the various levels of abstraction in such a VLSI design methodology, the register transfer (RT) level is the right abstraction at which to incorporate phantom redundancy. This is because:
1. there is a tight interdependence between the synthesized data path and the reconfiguration of such a data path,
2. the fault model is at the RT level of functional units, and
3. data for reconfiguration such as the clock-by-clock schedule and operation-to-operator binding can be easily obtained at the RT level.

RT level synthesis involves (i) translation of a high-level algorithmic description into an intermediate representation called the Control Data Flow Graph (CDFG), (ii) assignment of operations in the CDFG to clock cycles (scheduling), (iii) mapping the scheduled operations onto available functional units (binding) and (iv) synthesis of the control unit. It has been shown that scheduling and binding are NP-hard [13]. Besides, scheduling and binding are interdependent. Hence numerous heuristics have been proposed to solve these problems [9]. Most RT level synthesis systems solve scheduling and binding independently of each other. Since the synthesized architecture profoundly influences its reconfigurability, reconfiguration should be handled in an integrated manner with the other synthesis tasks. In this paper we developed a genetic algorithm [18] based technique to solve the simultaneous scheduling, binding and reconfiguration problem.
The schedule and binding in the presence of any single functional unit failure are constructed simultaneously. This yields an RT level data path with a minimal degradation in performance.

1.2.2 Controller Issues

In a gracefully degradable data path the control unit is important since it orchestrates the reconfiguration. There are two viable options for designing a controller for reconfiguration.
1. Programmable Controllers: Although programmable controllers suffer from the disadvantage of slightly larger silicon area for implementation and a slightly lower performance, they have a major advantage in terms of ease of reconfiguration. Even in the absence of faults in the system, the extra interconnect and the controller programmability give the user the option to implement new CDFGs on the architecture much more efficiently.
2. Composed Controllers: The controller for operating the fault-free data path is composed with the controllers for each of the single unit failure scenarios. Although these composed controllers are smaller in size and faster, they are hardwired.

1.3 Research Contributions

The important contributions of this paper are:
1. Phantom Redundancy: We present an area-efficient technique for data path reconfiguration. Phantom redundancy adds extra programmable interconnect to make the resulting data path reconfigurable in the presence of functional unit failures.
2. Integrating reconfiguration constraints with scheduling and binding: We developed a genetic-algorithm-based global optimization approach for the synthesis of area-efficient gracefully degradable data paths. A global approach is needed because the problem of reconfiguring a data path with minimal area overhead strongly depends on the original data path. The algorithm performs simultaneous scheduling, binding and reconfiguration to minimize the performance degradation in the presence of a functional unit failure.
The reported technique is applicable to regular array architectures and non-regular data path based designs.

2.0 Phantom Redundancy

Phantom redundancy is an area-efficient approach to implement gracefully degradable data paths. Phantom redundancy uses additional interconnections and yields gracefully degradable data paths with low hardware overhead. Upon detecting a faulty functional unit, the interconnection network is programmed to perform the intended function on the fault-free functional units, albeit at a reduced throughput. Phantom redundancy can be used for fabrication-time and real-time reconfiguration of data paths. This capability is crucial in military and space applications where replacement of a faulty module is either impossible or prohibitively expensive.

To illustrate the concept of phantom redundancy, consider a CDFG consisting of six operations a, b, c, d, e, f shown in Figure 1. Assuming that all operations are of the same type and no back-to-back chaining is allowed, the fastest schedule, requiring two clock cycles and four functional units, is shown in Figure 1 (a). The redundant interconnect shown as dotted lines in Figure 1 (b) makes this data path gracefully degradable in the presence of any single functional unit failure.

Figure 1: (a) Scheduled CDFG. (b) Data path implementing the CDFG. Two additional point-to-point links shown as dotted lines make the data path reconfigurable.

Upon identifying a faulty functional unit, the controller can be reprogrammed to operate the reconfigured data path with a degraded performance. For example, if functional unit F1 is faulty, operations a and c, bound to it in the original data path, are mapped to the fault-free functional unit F3 as shown in Figure 2 (a). Further, operation b is remapped to functional unit F4. This reconfigured data path operates at a degraded performance of 4 clock cycles (as opposed to 2 clock cycles in the fault-free data path).
Figure 2: Reconfiguration in the presence of faulty functional units.

The corresponding schedule is also shown in Figure 2 (a). This data path can tolerate all single functional unit failures (see Figure 2 (a,b)). This data path does not use any spare modules but uses two additional interconnections. This data path can also tolerate 50% of all two-unit faults ((F1, F2) and (F3, F4)). For all these scenarios the reconfigured data path consumes twice as many clock cycles as the fault-free data path.

Consider another CDFG consisting of fifteen add operations (a1, ..., a15) as shown in Figure 3 (a). The schedule shown here uses three adders (A0, A1, A2). One possible operation-to-operator binding is shown in Figure 3. The functional unit on which an operation is carried out is shown in capital letters. In Figure 3 (a), node a14 is scheduled in clock cycle 4 and is executed on adder A2. Until now, scheduling and binding have not accounted for possible performance degradation in the presence of an adder failure. Arbitrarily reallocating the responsibilities of the failed unit among the fault-free units increases the complexity of the interconnection network. Consequently, we propose to reallocate the responsibilities of a failed unit to a single backup unit.
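The idea of moving all operations of a failed unit to a single backup unit, and then measuring the resulting slowdown, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dependency edges in DEPS, the binding in BINDING, the backup assignment in BACKUP, and the greedy list scheduler are all hypothetical. They are not taken from Figures 1 to 3, and the scheduler stands in for, but is not, the genetic algorithm described in this paper.

```python
from typing import Dict, List

# Hypothetical six-operation CDFG (an assumption, not the graph of Figure 1).
# Each operation lists the operations whose results it consumes.
DEPS: Dict[str, List[str]] = {
    "a": [], "b": [], "d": [], "e": [],
    "c": ["a", "b"],
    "f": ["d", "e"],
}

# Assumed fault-free operation-to-operator binding on four identical units.
BINDING: Dict[str, str] = {
    "a": "F1", "c": "F1", "b": "F2", "d": "F3", "f": "F3", "e": "F4",
}

# Assumed single backup unit for each functional unit.
BACKUP: Dict[str, str] = {"F1": "F3", "F2": "F4", "F3": "F1", "F4": "F2"}


def schedule_length(deps: Dict[str, List[str]], binding: Dict[str, str]) -> int:
    """Greedy list scheduling: single-cycle operations, one operation per
    unit per clock cycle, and no chaining within a cycle."""
    finished = set()
    cycle = 0
    while len(finished) < len(deps):
        cycle += 1
        busy = set()      # units already used in this cycle
        started = []      # operations issued in this cycle
        for op in sorted(deps):
            if op in finished:
                continue
            unit = binding[op]
            if unit not in busy and all(p in finished for p in deps[op]):
                busy.add(unit)
                started.append(op)
        # Results become visible only in the next cycle (no chaining).
        finished.update(started)
    return cycle


def rebind_after_failure(binding: Dict[str, str],
                         backup: Dict[str, str],
                         failed: str) -> Dict[str, str]:
    """Move every operation of the failed unit to that unit's single backup."""
    return {op: (backup[unit] if unit == failed else unit)
            for op, unit in binding.items()}


fault_free = schedule_length(DEPS, BINDING)
degraded = {
    unit: schedule_length(DEPS, rebind_after_failure(BINDING, BACKUP, unit))
    for unit in sorted(BACKUP)
}

print("fault-free cycles:", fault_free)
for unit, cycles in degraded.items():
    print("failure of", unit, "->", cycles, "cycles")
```

For this assumed example the fault-free schedule takes 2 clock cycles and the worst single-unit failure takes 4, which is the kind of degradation figure a synthesis algorithm would try to minimize when choosing the schedule, binding and backup assignment together.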
Similar resources
Algorithm level recomputing using allocation diversity: a register transfer level approach to time redundancy-based concurrent error detection
In this paper, the authors propose an algorithm-level time redundancy-based concurrent error detection (CED) scheme against permanent and transient faults by exploiting the hardware allocation diversity at the register transfer level. Although the normal computation and the recomputation are carried out on the same data path, the operation-to-operator allocation for the normal computation is di...
Orthogonal Scan: Low-Overhead Scan for Data Paths
Orthogonal scan paths, which follow the path of the data flow, can be used in data path designs to reduce the test overhead — area, delay and test application time — by sharing functional and test logic. Orthogonal scan paths are orthogonal to traditional scan paths. Judicious ordering of the registers in the orthogonal scan path can allow the scan path to be implemented entirely with existing ...
Scheduling Verification in High-Level Synthesis - Implementation of a Normalizer and a Code Motion Verifier
High level synthesis is the process of generating the register transfer level (RTL) design from the behavioral description. The synthesis process consists of several interdependent phases: Preprocessing, Scheduling, Register Allocation and Binding of variables, Control Path and Data Path generation, and Generation of synthesizable Verilog code (RTL). A High-level synthesis tool, called Structur...
Automated Correctness Condition Generation for Formal Verification of Synthesized RTL Designs
High-level synthesis tools generate register-transfer level designs from algorithmic behavioral specifications. High-level synthesis process typically consists of dependency graph scheduling, functional unit allocation, register allocation, interconnect allocation and controller generation tasks. Widely used algorithms for these tasks retain the overall control flow structure of the behavioral spe...
Design for hierarchical testability of RTL circuits obtained by behavioral synthesis
Most behavioral synthesis and design for testability techniques target subsequent gate-level sequential test generation, which is frequently incapable of handling complex controller/data path circuits with large data path bit-widths. Hierarchical testing attempts to counter the complexity of test generation by exploiting information from multiple levels of the design hierarchy. We present techn...
Journal title:
- IEEE Trans. on CAD of Integrated Circuits and Systems
Volume 21, Issue
Pages -
Publication date 2002